An Empirical Comparison of Exact Nearest Neighbour Algorithms

Authors

Ashraf M. Kibriya

Eibe Frank

Abstract

Nearest neighbour search (NNS) is an old problem of practical importance in a number of fields. It involves finding, for a given point q, called the query, one or more points from a given set of points that are nearest to q. Since the problem was first posed, a great number of algorithms and techniques have been proposed for its solution. However, many of these algorithms have not been compared against each other on a wide variety of datasets. This research attempts to fill this gap to some extent by presenting a detailed empirical comparison of three prominent data structures for exact NNS: KD-Trees, Metric Trees, and Cover Trees. Our results suggest that there is generally little gain in using Metric Trees or Cover Trees instead of KD-Trees for the standard NNS problem.
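To make the problem statement concrete, here is a minimal Python sketch of exact NNS under Euclidean distance. It is not the authors' implementation: the function names (`brute_force_knn`, `build_kdtree`, `kdtree_nn`) and the tuple-based tree layout are assumptions chosen for illustration. The linear scan serves as a baseline, and the KD-tree search shows the pruning idea that tree-based methods rely on.

```python
import heapq
import math


def brute_force_knn(points, q, k=1):
    """Return the k points closest to q by scanning every point (exact)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, q))


def build_kdtree(points, depth=0):
    """Build a KD-tree as nested tuples: (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2                    # median point splits the set
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def kdtree_nn(node, q, best=None):
    """Return the exact nearest neighbour of q among the points in the tree."""
    if node is None:
        return best
    point, axis, left, right = node
    if best is None or math.dist(point, q) < math.dist(best, q):
        best = point
    diff = q[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = kdtree_nn(near, q, best)
    # The far side can only hold a closer point if the splitting plane
    # is nearer to q than the best candidate found so far.
    if abs(diff) < math.dist(best, q):
        best = kdtree_nn(far, q, best)
    return best


# Hypothetical toy data, not taken from the paper.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
query = (9, 2)
tree = build_kdtree(points)
print(brute_force_knn(points, query, k=2))  # [(8, 1), (7, 2)]
print(kdtree_nn(tree, query))               # (8, 1)
```

Both searches are exact, so they return neighbours at the same distance. The tree only changes how many points have to be examined, which is the kind of cost an empirical comparison of these data structures measures.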

Similar resources

Scaling up the Accuracy of K-nearest-neighbour Classifiers: a Naive-bayes Hybrid

k-nearest-neighbour (KNN) has been widely used as an effective classification model. In this paper, we summarize three main shortcomings confronting KNN and then single out three categories of approaches for overcoming its three main shortcomings. After reviewing some algorithms in each category, we presented a hybrid algorithm called dynamic k-nearest-neighbour naive Bayes with attribute weigh...

Limit theory for the random on-line nearest-neighbor graph

In the on-line nearest-neighbour graph (ONG), each point after the first in a sequence of points in R^d is joined by an edge to its nearest-neighbour amongst those points that precede it in the sequence. We study the large-sample asymptotic behaviour of the total power-weighted length of the ONG on uniform random points in (0, 1)^d. In particular, for d = 1 and weight exponent α > 1/2, the limitin...

Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)

Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems in order to manage and minimize the inherent risks of the financial sector, thus, the design and improvement of credit scorin...

A comparison of k-nearest neighbour algorithms with performance results on speech data

The (k-)nearest neighbour problem is well known in a wide range of areas. Many algorithms to tackle this problem suffer from the “curse of dimensionality” which means that the execution time grows exponentially with increasing dimension. Therefore, it is important to have efficient algorithms for the problem. In this report, some well known tree-based algorithms for the k-nearest neighbour are ...

Optimal weighted nearest neighbour classifiers

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends asymptotically only on the dimension d of the feature vec...

Publication date: 2007
